SPLAT Provides Programmers a Fast and Accurate Study of Memory Behavior without the Necessity of a Costly Memory Simulator. The Tool Is Suitable for Use as a Step in an Iterative Optimization

Authors

  • Jesús Sánchez
  • Antonio González
Abstract

Memory performance is becoming a major bottleneck in current microprocessors. A great deal of research has aimed at developing techniques for improving memory performance. Some of these techniques rely on hardware alone, but many require programmer or compiler support. Examples of the latter are software prefetching, blocking, and copying. To use these techniques effectively, the programmer must have some knowledge of the program’s behavior. For instance, prefetching is useful only if it is limited to instructions that frequently produce cache misses. Adding a prefetch instruction to every memory instruction could result in significant performance degradation.

These techniques might also require quantification of the different types of cache misses (see sidebar, page 60). For instance, microprocessors can avoid compulsory misses through both hardware and software prefetching. Blocking, or tiling, is a method of avoiding capacity misses; copying and padding are techniques for reducing the effect of conflict misses. Many processors provide hints in their memory instructions that the compiler can use for optimizing memory performance. Examples of such hints are the PowerPC’s cache bypass facility and the hints incorporated by the IA-64 instruction set. Effective use of these hints requires information about the program’s locality behavior.

The process of obtaining information about a program’s locality characteristics is data locality analysis. Traditionally, this analysis takes place either at compile time or at runtime. The former approach incurs low overhead but is relatively inaccurate because the compiler lacks some information. The runtime approach usually takes the form of a memory hierarchy simulation, which is quite accurate but very slow. In this article, we introduce SPLAT (Static and Profiled Data Locality Analysis Tool). The tool’s purpose is to provide a fast study of memory behavior without the necessity of a costly memory simulator.
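The three miss categories mentioned above (compulsory, capacity, and conflict) can be made concrete with a small trace-driven simulation. The sketch below is not SPLAT and is not taken from the article; it is a minimal illustration of the conventional classification, assuming a direct-mapped cache compared against a fully associative LRU cache of the same size. The function name `classify_misses` and the default cache parameters are hypothetical.

```python
from collections import OrderedDict


def classify_misses(addresses, num_lines=4, line_size=16):
    """Classify accesses to a direct-mapped cache (illustrative sketch only).

    compulsory: first reference ever made to the block
    conflict:   miss that a fully associative LRU cache of the
                same capacity would have served as a hit
    capacity:   every other miss
    """
    seen = set()              # blocks referenced at least once
    fully_assoc = OrderedDict()  # reference cache, LRU order (oldest first)
    direct = [None] * num_lines  # direct-mapped cache: one block per line
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}

    for addr in addresses:
        block = addr // line_size

        # Look up and update the fully associative LRU reference cache.
        fa_hit = block in fully_assoc
        if fa_hit:
            fully_assoc.move_to_end(block)
        else:
            if len(fully_assoc) >= num_lines:
                fully_assoc.popitem(last=False)  # evict least recently used
            fully_assoc[block] = True

        # Look up and update the direct-mapped cache being studied.
        index = block % num_lines
        dm_hit = direct[index] == block
        direct[index] = block

        if dm_hit:
            counts["hit"] += 1
        elif block not in seen:
            counts["compulsory"] += 1
        elif fa_hit:
            counts["conflict"] += 1
        else:
            counts["capacity"] += 1
        seen.add(block)

    return counts


if __name__ == "__main__":
    # Two addresses that map to the same cache line evict each other.
    print(classify_misses([0, 64, 0, 64]))
```

With the default parameters, addresses 0 and 64 fall into the same cache line, so after the two first-touch misses every further access is a conflict miss. This is the kind of pattern that padding or copying targets, whereas a working set larger than the cache produces capacity misses that blocking targets.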
SPLAT consists of a static locality analysis enhanced by simple profiling data. Its overhead is low because it performs most of the analysis at compile time, and because the required profiling support is just a basic-block-execution count. Many commercial compilers support this profiling option. Compared with simulation techniques, SPLAT’s estimation technique is highly accurate for numeric codes. The tool is useful not only for compilers but also for programmers. To tune a program, programmers should know its performance, the ...
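The abstract does not describe how SPLAT combines its static analysis with the profile, so the following is only a hedged illustration of the general idea: a per-reference miss ratio (assumed here to come from some compile-time analysis) is weighted by the execution count of the basic block that contains the reference. All names, block labels, and miss ratios below are hypothetical.

```python
def estimate_misses(block_counts, static_refs):
    """Weight assumed static miss ratios by profiled basic-block counts.

    block_counts: {basic_block_name: execution_count} from a profile run
    static_refs:  {reference_name: (basic_block_name, miss_ratio)}
                  where miss_ratio is an assumed compile-time estimate
    Returns {reference_name: estimated_number_of_misses}.
    """
    return {
        ref: block_counts.get(block, 0) * miss_ratio
        for ref, (block, miss_ratio) in static_refs.items()
    }


def rank_by_misses(estimates):
    """Return reference names ordered from most to fewest estimated misses."""
    return sorted(estimates, key=estimates.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical profile and hypothetical static estimates.
    counts = {"loop_body": 1000, "cleanup": 10}
    refs = {
        "a[i]": ("loop_body", 0.25),
        "b[j][i]": ("loop_body", 1.0),
        "tmp": ("cleanup", 0.5),
    }
    estimates = estimate_misses(counts, refs)
    print(rank_by_misses(estimates))
```

A ranking of this kind is what a programmer or compiler would need in order to restrict prefetching to the references that miss most often, rather than attaching a prefetch to every memory instruction.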


Similar resources

A NEW TWO STEP CLASS OF METHODS WITH MEMORY FOR SOLVING NONLINEAR EQUATIONS WITH HIGH EFFICIENCY INDEX

We attempt to extend a two-step without-memory method to its with-memory counterpart. Then, a new two-step derivative-free class of without-memory methods, requiring three function evaluations per step, is suggested by using a convenient weight function for solving nonlinear equations. Eventually, we obtain a new class of methods by employing a self-accelerating parameter calculated in each iterative...


Fast Reconstruction of SAR Images with Phase Error Using Sparse Representation

In past years, a number of algorithms have been introduced for synthetic aperture radar (SAR) imaging. However, they all suffer from the same problem: the amount of data to process is considerably large. In recent years, compressive sensing and sparse representation of the signal in SAR have gained significant research interest. This method offers the advantage of reducing the sampling rate, bu...


Design of a Multiplier for Similar Base Numbers Without Converting Base Using a Data Oriented Memory

One of the challenges in hardware performance is designing a high-speed calculating unit. Higher calculation speed in a computer system is reflected in its overall performance. As a result, designing a high-speed calculating unit is of utmost importance. In this paper, we start the design with the knowledge that one multiplier is made of several adders and one divider is made of several su...


Chaotic Genetic Algorithm based on Explicit Memory with a new Strategy for Updating and Retrieval of Memory in Dynamic Environments

Many of the problems considered in optimization and learning assume that solutions exist in a dynamic environment. Hence, algorithms are required that dynamically adapt to the problem’s conditions and search new conditions. In most cases, using information from the past allows quick adaptation after a change. This is the idea underlying the use of memory in this field, which involves key design issue...


A new iterative with memory class for solving nonlinear equations

In this work we develop a new optimal without memory class for approximating a simple root of a nonlinear equation. This class includes three parameters. Therefore, we try to derive some with memory methods so that the convergence order increases as high as possible. Some numerical examples are also presented.




Publication date: 2000